Persistent Suffix Trees and Suffix Binary Search Trees as DNA Sequence
Abstract
We constructed suffix trees and suffix binary search trees for C. elegans chromosomes, stored them on disk and reused them, and measured their performance using orthogonal persistence for Java (PJama). We compare the performance of our implementation with that of a transient suffix tree, and discuss the suitability of such indexes in pursuing our long-term goal of indexing large genomes. We identify the potential for persistent DNA indexes in a variety of biological and medical contexts, and believe they will complement current string searching methods based on transient data structures.
Similar resources
ACCEPTED FOR PhDOO WORKSHOP, ECOOP'00 PJama Stores and Suffix Tree Indexing for Bioinformatics Applications
Motivation: The largest public-domain biological sequence archive exceeds 6 Gbases of DNA, and much larger amounts of sequence data are held by industrial labs. The amount of data is growing exponentially, but sequence search technologies still rely on flat file storage and high-throughput parallel computers that read all data sequentially to find sequence similarities or patterns. This issue is not addressed by ...
Optimal Suffix Tree Construction with Large
The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. In this paper, we present a novel, deterministic algorithm for the construction of suffix trees. We settle the main open problem in the construction of suffix trees: we build suffix trees in linear time for integer alphabets.
Augmenting Suffix Trees, with Applications
Information retrieval and data compression are the two main application areas where the rich theory of string algorithmics plays a fundamental role. In this paper we consider one algorithmic problem from each of these areas and present highly efficient linear or near-linear time algorithms for both problems. Our algorithms rely on augmenting the suffix tree, a fundamental data structure in string algo...
Generalizations of suffix arrays to multi-dimensional matrices
We propose multi-dimensional index data structures that generalize suffix arrays to square matrices and cubic matrices. Giancarlo proposed a two-dimensional index data structure, the Lsuffix tree, that generalizes suffix trees to square matrices. However, the construction algorithm for Lsuffix trees maintains complicated data structures and uses a large amount of space. We present simple and practical ...
The enhanced suffix array and its applications to genome analysis
In large-scale applications such as computational genome analysis, the space requirement of the suffix tree is a severe drawback. In this paper, we present a uniform framework that enables us to systematically replace every string processing algorithm that is based on a bottom-up traversal of a suffix tree by a corresponding algorithm based on an enhanced suffix array (a suffix array enhanced with the lcp-ta...
Publication date: 2000